Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make base algorithms (cobyla, bobyqa, etc.) directly available in the C API #183

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

emmt
Copy link

@emmt emmt commented Apr 5, 2024

This patch simply moves the definitions of cobyla_c, bobyqa_c, etc. from prima.c to prima.h.

It is not that the new high level interface is bad (on the contrary, it may greatly simplify the life a C user) but the other functions (formerly prima_bobyqa, prima_newuoa, etc.) have their utility. For instance, it may be easier to write langage bindings for the base algorithms because you do not have to deal with C structures, it is just needed to wrap function calls with simple arguments like values, and pointers to functions or to arrays. In that respect, updating the Julia interface in PRIMA.jl would represent a lot of work and potentially new bugs (see this PR).

I would suggest to keep the same name as before, e.g. prima_bobyqa instead of bobyqa_c, but that's of lesser importance.

@emmt emmt requested a review from zaikunzhang as a code owner April 5, 2024 08:52
@zaikunzhang
Copy link
Member

zaikunzhang commented Apr 5, 2024

Thank you @emmt for the comment and proposal.

However, it is clear to me that we should not do this, namely to make both individual solvers (newuoa etc) and prima available at the same time. This is a bitter lesson learned from PDFO, the predecessor of PRIMA.

This has been discussed before.

Let me briefly explain why.

Providing both prima and individual algorithms (e.g., newuoa) will make the logic of the package unnecessarily complicated. It may look harmless at first glance. However, note that prima will call also newuoa. Should we differentiate the following two calls?

  • User calls newuoa
  • User calls prima, which in turn calls newuoa

The answer is yes. Why? Because we need to do some preprocessing and postprocessing before and after solving the problem. When newuoa receives a problem, we need to decide whether the problem has been preprocessed; before newuoa returns a result, we need to decide whether it needs to be postprocessed.

In other words, newuoa should always keep these questions in mind:

  • Who is calling me?
  • Whom am I reporting to?

This is too complicated.

This is why we should provide the user with only one interface prima to the solvers. The algorithm to be called will be indicated by algortihm. Individual solvers will not be exposed to users directly. I hope my explanation is understandable.

Indeed, this also applies to the higher-level interfaces including PRIMA.jl. However, I did not point this out early enough. My fault. Consequently, we should not provide a prima function in PRIMA.jl, since we have already exposed the individual solvers to the users. ( The complication has not appeared yet, because PRIMA.jl has not included necessary pre/postprocessing. )

Thanks and regards,
Zaikun

@emmt
Copy link
Author

emmt commented Apr 5, 2024

Hello @zaikunzhang,

This is not my point and if you look in the PRIMA.jl interface, you'll see that there are no such ambiguity about who calls who because the pre-processing would always be the same whether you directly call a specific base algorithm or call the high-level driver with arguments/options that make it decide to also call the same base algorithm. Besides, the arguments/keywords names are the same and have the same semantic in the driver and in the base methods. The only thing that changes is that, in the former case, the caller knows exactly which algorithm is used (and he may have good reasons to make this choice e.g. because he knows some of the pros or cons of the algorithm), while, in the latter case, the caller just know the result (which is perfectly suitable for most needs).

In other words, for a Julia user, there is no real direct access to the base algorithms, all public methods (prima, cobyla, bobyqa, etc.) implement a kind of high level interface with exactly the same pre-/post-processing and the arguments/keywords/options defining the problem have the same names and meaning.

BTW the Julia interface is very close to what is done in the high-level C driver. This is not surpring as we all actively participated to the discussion about the scaling of variables, automatic choice of the base algorithm, etc. Thiese discussion were very fruitful and useful to design waht should be the correct "high-level API" and, for me, the current Julia public API reflects quite faithfully that consensus.

Now I come to my points (sorry if I wasn't clear):

My first point is a technical reason. Interfacing the high level C driver is more difficult (at least in Julia) because it requires to deal with C structures (which is possible but more prone to errors sucha as misalignement etc.). It is faster and easier to keep most of the structure of the Julia interface and update it to account for the new arguments taken by the base algorithms (none of them are structures). Dealing with C structures may also be an issue for others bindings than Julia.

My second point is that PRIMA.jl has gained some interest in the Julia community (mostly thanks to you) and, if the Julia API happens to change from the Julia user point of view (which is unavoidable if only the high level driver is provided), it may discourtage these users and the developers who integrated PRIMA.jl in their packages.

Bonne journée ;-)

@zaikunzhang
Copy link
Member

zaikunzhang commented Apr 5, 2024

Hi @emmt ,

My second point is that PRIMA.jl has gained some interest in the Julia community (mostly thanks to you) and, if the Julia API happens to change from the Julia user point of view (which is unavoidable if only the high level driver is provided), it may discourtage these users and the developers who integrated PRIMA.jl in their packages.

First of all, I do NOT think the new C interface will make the public API of PRIMA.jl change, and, as you said very well, the public API of PRIMA.jl should not change since people are already using it (although I hope we provided a unified interface function prima rather than individual solvers newuoa etc. This is my fault.).

This is not my point and if you look in the PRIMA.jl interface, you'll see that there are no such ambiguity about who calls who because the pre-processing would always be the same whether you directly call a specific base algorithm or call the high-level driver with arguments/options that make it decide to also call the same base algorithm. Besides, the arguments/keywords names are the same and have the same semantic in the driver and in the base methods. The only thing that changes is that, in the former case, the caller knows exactly which algorithm is used (and he may have good reasons to make this choice e.g. because he knows some of the pros or cons of the algorithm), while, in the latter case, the caller just know the result (which is perfectly suitable for most needs).

This is because, as I mentioned in my last comment (copied below), PRIMA.jl currently does not include necessary pre/postprocessing yet.

Indeed, this also applies to the higher-level interfaces including PRIMA.jl. However, I did not point this out early enough. My fault. Consequently, we should not provide a prima function in PRIMA.jl, since we have already exposed the individual solvers to the users. ( The complication has not appeared yet, because PRIMA.jl has not included necessary pre/postprocessing. )

It will be difficult for me to make myself understood if my previous comment failed to make things clear to you. This is because we have different understandings about what an interface is. We may have a zoom meeting so that I can explain to you.

Thanks.

Zaikun

@emmt
Copy link
Author

emmt commented Apr 5, 2024

@zaikunzhang, I can see that we have different understandings, we may have a chat if you want.

Where are implemented the pre-/post-processing you are mentioning? The discussion you pointed does not make these clear to me.

You did not answer on my specific points...

@zaikunzhang
Copy link
Member

zaikunzhang commented Apr 5, 2024

@zaikunzhang, I can see that we have different understandings, we may have a chat if you want.

First of all, let me make it clear that I do not intend to change the public API of PRIMA.jl.

Where are implemented the pre-/post-processing you are mentioning? The discussion you pointed does not make these clear to me.

  • Preprocessing

    -- Preprocessing means validating the user's inputs, and defaulting parameters to proper values if the user does not specify them (correctly).
    -- You can simply understand it as "cleaning the inputs" before sending them to the solver.
    -- As an example, see preprima.m for the MATLAB implementation of the preprocessing. A complete and careful preprocessing takes more than 2000 lines of MATLAB code!

  • Postprocessing

    -- Postprocessing means validating the solver's outputs, and putting them into the form that are the most useful to the user.
    -- You can simply understand it as "cleaning the outputs" before sending them to the user.
    -- As an example, see postprima.m for the MATLAB implementation of the postprocessing. A complete and careful preprocessing takes more than 800 lines of MATLAB code!

  • The complication of preprima.m and postprima.m partially comes from the fact that the MATLAB interface exposes both prima and the individual solvers newuoa etc, which is a bad design inherited from PDFO. preprima and postprima need always remember "who is calling me" and "whom I am reporting to". I regret this design so much and I do not want it to happen again in any other language.

  • More concrete examples of pre/postprocessing.

    -- Scaling the problem and unscaling it before returning the result. PRIMA.jl already does this, although without explicitly calling it preprocessing and postprocessing.

    -- Removing linear equality constraints. This is needed because the LINCOA algorithm (the mathematical one, not the code) expects all linear constraints to be inequalities. We need to rewrite a problem with constraints $A_1 x = b_1, A_2 x \le b_2$ to a problem with $A y \le b$ by a certain linear transformation $x = By$. Without doing so, LINCOA still works (by treating $A_1 x = b_1$ as two inequalities), but not in the most efficient way. See Section 4.3 of the PDFO paper. This is preprocessing, and it obviously necessitates postprocessing, as the solver returns the solution in terms of $y$, but the user expects $x$.

    -- Removing variables that are almost fixed by bound constraints (i.e., $l \le x \le u$ with $l \approx u$). This is needed by BOBYQA. This is another preprocessing that necessitates a corresponding postprocessing.

    -- Projecting the initial point to the feasible region for linearly constrained problems. See For bound-constrained and linearly constrained problems, project the starting point to the feasible region PRIMA.jl#10

    -- Wrapping up the objective and constraint functions to handle failures and abnormal returns properly. See Wrap the objective & constraint function evaluation and catch exceptions if any  PRIMA.jl#15

There are many many others ... I hope you get a feeling why pre/postprocessing needs more than 2000 lines of code.

The MATLAB interface has implemented all of these. The Python interface will implement them as well, mostly by inheriting what PDFO has done. Without doing so, the interfaces would only be half-baked (better than nothing). See again my post about what an interface is.

I do acknowledge that proper pre/postprocessing is nontrivial to implement. Hence other interfaces may not implement all of these. In that case, we essentially leave the pre/postprocessing to the user, which is not ideal, but better than nothing.

You did not answer on my specific points...

My first point is a technical reason. ...

I totally agree with this one, although I have no much to say as I am ignorant about Julia. I hope the technical problems can be solved.

BTW the Julia interface is very close to what is done in the high-level C driver. This is not surpring as we all actively participated to the discussion about the scaling of variables, automatic choice of the base algorithm, etc. Thiese discussion were very fruitful and useful to design waht should be the correct "high-level API" and, for me, the current Julia public API reflects quite faithfully that consensus.

I agree, and I do not intend to change the public API of PRIMA.jl.

@emmt
Copy link
Author

emmt commented Apr 8, 2024

It is clear that all these are necessary but it would be much more consistent and logical if all the pre/post processings be done by the code that every langage binding is using in the end: the Fortran 90 version of the algorithm(s)... This is the only way to make sure that all intefaces behave the same. Plus this is the way to avoid duplicating the efforts and make easier interfacing PRIMA in other langages.

@zaikunzhang
Copy link
Member

zaikunzhang commented Apr 9, 2024

It is clear that all these are necessary but it would be much more consistent and logical if all the pre/post processings be done by the code that every langage binding is using in the end: the Fortran 90 version of the algorithm(s)... This is the only way to make sure that all intefaces behave the same. Plus this is the way to avoid duplicating the efforts and make easier interfacing PRIMA in other langages.

I agree with the idea in general. That is also what the Fortran backend tries to do.

However, some pre/postprocessing is too complicated, if not impossible, to do in Fortran due to the limitation / nature of the language. In that case, I do believe it is more reasonable to leave the pre/postprocessing to the high-level interfaces, where the language is usually more powerful and flexible.

Let me reiterate on my examples in this aspect.

  • More concrete examples of pre/postprocessing.
    -- Scaling the problem and unscaling it before returning the result. PRIMA.jl already does this, although without explicitly calling it preprocessing and postprocessing.

This is nontrivial in Fortran, due to the lack of lambda/closure functions. I do have an idea about how to implement it, just I do not have the time yet. See #182

-- Removing linear equality constraints. This is needed because the LINCOA algorithm (the mathematical one, not the code) expects all linear constraints to be inequalities. We need to rewrite a problem with constraints A1x=b1,A2x≤b2 to a problem with Ay≤b by a certain linear transformation x=By. Without doing so, LINCOA still works (by treating A1x=b1 as two inequalities), but not in the most efficient way. See Section 4.3 of the PDFO paper. This is preprocessing, and it obviously necessitates postprocessing, as the solver returns the solution in terms of y, but the user expects x.

This is nontrivial in Fortran because of two things.

  1. There is no lambda function in Fortran.
  2. We need a subroutine to do SVD. This is a rabbit hole I do not want to jump in. (Also, PRIMA will not depend on any external library such as LAPACK). In contrast, SVD is readily available in MATLAB/Python/Julia/R, which is the main reason why I prefer it to be done in high-level interfaces.

-- Removing variables that are almost fixed by bound constraints (i.e., l≤x≤u with l≈u). This is needed by BOBYQA. This is another preprocessing that necessitates a corresponding postprocessing.

This is nontrivial in Fortran due to the lack of lambda function.

-- Projecting the initial point to the feasible region for linearly constrained problems. See libprima/PRIMA.jl#10

This is nontrivial in Fortran because it needs a QP (quadratic programming) solver, which is readily available in high-level languages.

-- Wrapping up the objective and constraint functions to handle failures and abnormal returns properly. See libprima/PRIMA.jl#15

This is impossible in Fortran, which does not have the concept of exception handling.

I must stress that the above examples are only examples. There are many pre/postprocessing that cannot be done in the Fortran backend at all. Here are a few examples in MATLAB. I use MATLAB to avoid saying stupid things since I do not know Julia.

  • Validating inputs that may go wrong due to the flexibility of the language. For example, MATLAB users may input a rhobeg that is not a real number (e.g., a string), or input a maxfun that is not a positive integer (e.g., maxfun = 22.34). It will be sensible to detect such incorrect inputs and reset them to the default values (if applicable), and at the same time raise a warning to inform the user. This cannot be done on the Fortran side.

  • Validate the outputs. This is to make sure that the high-level interface reads the outputs from the Fortran backend correctly. For example, does the MATLAB code return an x that is a real vector with the same shape and size as x0? Does the MATLAB code return a constraint violation that is a nonnegative number? Does the MATLAB code return a constraint violation even though the problem is unconstrained (it should not)? This cannot be done on the Fortran side. You may argue that this kind of checking should be done only during the development, and it is not necessary once we are sure that the interface is working correctly. I do not agree, because I am never sure whether my code is working correctly, and believe me, doing this kind of checking is better than simply believing that the code is correct. Of course, if a certain checking is considered expensive, we should provide an opinion to turn it off (and do it only in the "debug mode").

You may tell me that the particular examples do not apply to Julia, but the point is that pre/postprocessing may be a result of the interface itself.

Thank you for thinking about it.

@emmt emmt mentioned this pull request Apr 9, 2024
@emmt
Copy link
Author

emmt commented Apr 9, 2024

OK then I think that this PR (#189) may be a better solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants